APPARATUS FOR ANALYSING CARDIAC EVENTS
Technical field 

The present invention relates to an apparatus for analysing cardiac events
detected in electrograms, EGMs, and to a heart stimulator provided with such an
apparatus.

In the following the expression cardiac event denotes the depolarization 
phase in the cardiac cycle which for atrial signals is commonly known as P wave 
and for ventricular signals as R wave or QRS complex. 


Background 

In the field of devices for cardiac rhythm management (CRM), accurate
rhythm classification is an increasingly important aspect. Pacemakers are primarily
used to assist in bradycardia or when the electrical propagation path is blocked,
whereas the primary use of implantable cardioverter defibrillators (ICDs) is to
terminate ventricular arrhythmia, a life-threatening condition if not acutely treated.
In both types of devices, accurate event classification of the electrogram signal is
needed for identifying, e.g., atrial and ventricular fibrillation in order to give
appropriate therapy for the detected arrhythmia. For pacemakers, this may imply
changing the pacing mode in order to stabilize the ventricular rhythm during an
episode of atrial fibrillation. An ICD responds to ventricular fibrillation by giving a
defibrillation shock intended to terminate the fibrillation.

Ever increasing demands are put on both kinds of devices to better handle
their primary task as well as to manage tasks other than those originally intended.
One such task may, for an implantable medical device, be to identify atrial
flutter in order to terminate it by atrial pacing, or to defibrillate atrial fibrillation.
Although it is not a life-threatening arrhythmia, atrial fibrillation is an inconvenience
to the patient and increases the risk of other diseases such as stroke. Atrial
pacing may also be one way of terminating supraventricular tachycardias. An
ICD-specific task is to identify atrial fibrillation so that it is not mistaken for
ventricular fibrillation, which would risk giving an unnecessary, and possibly
harmful, defibrillation shock. Another, more general, use is to efficiently store rhythm
data for later analysis and evaluation, as is already done in modern ICDs. By collecting
data, better knowledge of the evolution of cardiac diseases and the functionality of
the device can be obtained.

Clustering represents an important task within the classification problem,
where each individual event is assigned to a cluster of events with similar features.
Labelling of the clusters, i.e., associating the cluster with a specific cardiac rhythm,
completes the classification such that the device can provide proper therapy when
needed. However, certain constraints distinguish clustering in CRM devices from
clustering in general. In order to give immediate therapy, clustering must be done
in real time, thus excluding many iterative clustering algorithms such as k-means
clustering and competitive learning. Various methods have recently been
presented concerning clustering of signals from the surface electrocardiogram
(ECG), based on, e.g., self-organizing maps or fuzzy hybrid neural networks, see
M. Lagerholm, C. Peterson, G. Braccini, L. Edenbrandt, and L. Sörnmo, "Clustering
ECG complexes using Hermite functions and self-organizing maps", IEEE Trans.
Biomed. Eng., vol. 47, pp. 838-848, July 2000, and S. Osowski and T. Linh, "ECG
beat recognition using fuzzy hybrid neural network", IEEE Trans. Biomed. Eng.,
vol. 48, pp. 1265-1271, November 2001. However, most clustering algorithms
used for ECG analysis are computationally rather complex and therefore unsuitable
for implantable CRM devices. Furthermore, not much a priori morphologic
information is associated with the various rhythms in the electrogram (EGM); this is in
contrast to the more well-defined ECG.

Previously presented work in the area of intracardiac event classification
mainly focuses on discrimination of a specific condition in order to discern, e.g., atrial
fibrillation from other atrial tachyarrhythmias, see A. Schoenwald, A. Sahakian, and
S. Swiryn, "Discrimination of atrial fibrillation from regular atrial rhythms by spatial
precision of local activation direction", IEEE Trans. Biomed. Eng., vol. 44, pp.
958-963, October 1997. Other applications involve discrimination of ventricular from
supraventricular tachycardia, see L. Koyrakh, J. Gillberg, and N. Wood, "Wavelet
based algorithms for EGM morphology discrimination for implantable ICDs", in
Proc. of Comp. in Card. (Piscataway, NJ, USA), pp. 343-346, IEEE, IEEE Press,
1999, and G. Gronefeld, B. Schulte, S. Hohnloser, H.-J. Trappe, T. Korte, C.
Stellbrink, W. Jung, M. Meesmann, D. Bocker, D. Grosse-Meininghaus, J. Vogt,
and J. Neuzner, "Morphology discrimination: A beat-to-beat algorithm for the
discrimination of ventricular from supraventricular tachycardia by implantable
cardioverter defibrillators", J. Pacing Clin. Electrophysiol., vol. 24, pp. 1519-1524,
October 2001. More general classification algorithms, which in turn involve training
on individual patients, have been based on analogue neural networks or wavelet
analysis for morphologic discrimination of arrhythmias.
WO 97 39 681 describes a defibrillator control system comprising a
pattern recognition system. The intracardiac electrogram signal is digitised and
delivered to a feature selector. The feature selector outputs selected features
to a trained classifier, which provides information as to which group, e.g.
ventricular tachycardia, the signal should be assigned to. The classifier
outputs the classified information for use in a therapeutic decision.

US 5 271 411 discloses ECG signal analysis and cardiac
arrhythmia detection by extraction of features from a scalar signal. A QRS pattern
vector is then transformed into features describing the QRS morphology, viz. a
QRS feature vector. A normal QRS complex is identified based on the population
of QRS complexes located within clusters of QRS features within a feature space
having a number of dimensions equal to the number of extracted features. The
extracted morphology information is then used for judging whether a heart beat is
normal or abnormal.

US 5 638 823 describes non-invasive detection of coronary artery
disease. A wavelet transform is performed on an acoustic signal representing one
or more sound events caused by turbulence of blood flowing in an artery to provide
parameters for a feature vector. This feature vector is used as one input to neural
networks, the outputs of which represent a diagnosis of coronary stenosis in a
patient.

In Michael A. Unser et al, "Wavelet Applications in Signal and Image
Processing IV", Proceedings SPIE - The International Society for Optical
Engineering, 6-9 August 1996, vol. 2825, part two of two parts, pp. 812-821, a
wavelet packet based compression scheme for single lead ECGs is disclosed,
including QRS clustering and grouping of heart beats of similar structures. For
each heart beat detected, its QRS complex is compared to templates of previously
established groups. Point-by-point differences are used as similarity measures.
The current beat is assigned to the group whose template is most similar, provided
predetermined conditions are satisfied. Otherwise a new group is created with the
current QRS complex used as the initial group template.
Future pacemakers and ICDs will include more advanced means for
arrhythmia detection and therapy, and the purpose of the present invention is to
propose a technique for separation of cardiac rhythms in a reliable way on the
basis of electrogram event clustering.

Disclosure of the invention 

The above purpose is obtained by an apparatus according to claim 1.
Certain arrhythmias need to be diagnosed immediately in order to give proper
therapy, while, for others, it may be sufficient to record the rhythm for data
collection purposes. Thus, the classification problem, viz. to label the rhythms
based on clusters using clinical terms, may not always be necessary to implement.

According to advantageous embodiments of the apparatus according to
the invention both morphologic and temporal data are considered for clustering.
Morphologic features are efficiently extracted by use of the dyadic wavelet
transform, after which the events are grouped by leader-follower clustering.
The event detection problem, based on the same transform, has
previously been treated in Swedish patent application no. 0103562-5.

According to another advantageous embodiment of the apparatus
according to the invention an integrating means is provided to integrate said
distance over a predetermined period of time. By integrating the distance over a
period of time it is possible to distinguish irregular rhythms, like atrial fibrillation,
from regular rhythms. The integral total distance in case of atrial fibrillation will be
high whereas regular rhythms will result in a lower total distance.
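
The integration of the distance can be pictured as a running sum over the most
recent events. The following is only a minimal sketch under stated assumptions:
the class name DistanceIntegrator and the choice of a window counted in events
(rather than in seconds) are illustrative and are not taken from the text.

```python
from collections import deque


class DistanceIntegrator:
    """Running sum of the last `window` event-to-cluster distances (hypothetical helper)."""

    def __init__(self, window):
        # deque with maxlen silently drops the oldest distance when full
        self.buf = deque(maxlen=window)

    def add(self, d2):
        """Store the distance of the newest event and return the integrated distance."""
        self.buf.append(d2)
        return sum(self.buf)
```

A regular rhythm keeps feeding small distances, so the returned total stays low,
whereas an irregular rhythm such as atrial fibrillation would be expected to drive
the total up. Any decision threshold on the total would be a design parameter.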

The invention also relates to a heart stimulator provided with the
above-mentioned apparatus for controlling the therapeutic stimulation depending on
arrhythmia detection.

Brief description of the drawings 
To explain the invention in greater detail, embodiments of the invention,
chosen as examples, will now be described with reference to the enclosed
drawings, on which figure 1 shows impulse responses of a filter bank used for
cardiac event detection, figure 2 is a flow chart of the clustering algorithm performed
with the apparatus according to the invention, figure 3 illustrates the computational
complexity for different clustering algorithms, figure 4 presents in (a) clustering
performance in terms of probability of a correct clustering of an event, $P_{CC}$, and
probability of a dominant event in a cluster, $P_{DE}$, and in (b) the spread in the number
of clusters, figure 5 illustrates in (a) clustering performance in terms of $P_{DE}$ and
$P_{CC}$ as a function of a distance threshold $\eta$ and in (b) the number of clusters as a
function of $\eta$, figure 6 exemplifies the clustering performance for one set of
parameters, figure 7 exemplifies the clustering result on a concatenated EGM for three
cases, figure 8 is a flow chart illustrating the function in broad outline of an
embodiment of the apparatus according to the invention, and figure 9 is a block
diagram of an exemplifying embodiment of a heart stimulator provided with an
apparatus according to the invention.

Description of preferred embodiments 
P wave detection and feature extraction 

A signal model assuming that the event waveform is composed of a linear
combination of representative signals is considered. The feature extraction
problem is then to estimate the individual components of the representative
signals since each morphology will have its own linear combination. By using the
dyadic wavelet transform, different widths of the two fundamental monophasic and
biphasic waveforms are included in the model at a low cost.

Feature extraction 

It is assumed that the QRS waveform is composed of a linear combination
of representative signals,

$$H = [\, h_1 \;\cdots\; h_P \,] \qquad (2)$$

where each function, $h_j$, $j = 1, \ldots, P$, is of size M x 1. The only restriction
on H is that it must have full rank. Different morphologies, $s(n)$, are modelled by
the P x 1 coefficient vector $\theta(n)$, with the linear model

$$s(n) = H\theta(n) \qquad (3)$$

where n is a temporal variable describing when the event occurs. The observed
signal, $x(n)$, is assumed to be modelled by,

$$x(n) = H\theta(n) + w(n) \qquad (4)$$
where the M x 1 noise vector $w(n)$ is assumed to be zero-mean, white, and
Gaussian with variance $\sigma_w^2$. Consequently, the vector $\mathbf{x}(n)$ is defined as

$$\mathbf{x}(n) = \begin{bmatrix} x(n) \\ \vdots \\ x(n+M-1) \end{bmatrix} \qquad (5)$$

implying an event duration of M samples, beginning at n. The probability density
function of $x(n)$ for a specific realization of $\theta(n)$, $p(x(n); \theta(n))$, is thus given by

$$p(x(n); \theta(n)) = \frac{1}{(2\pi\sigma_w^2)^{M/2}} \exp\left[ -\frac{1}{2\sigma_w^2} (x(n) - H\theta(n))^T (x(n) - H\theta(n)) \right] \qquad (6)$$

In this model, a complete description of a QRS complex is provided by the
deterministic unknown parameter vector $\theta(n)$. The absence of a QRS complex
corresponds to the case when $\theta(n)$ is equal to $\mathbf{0}$, where $\mathbf{0}$ is the P x 1 zero vector.
In general, no a priori knowledge is available on $\theta(n)$, and therefore an estimate is
required before detection can take place. Furthermore, only one event is assumed
to take place within the observation interval 0 < n < N.

Filterbank representation

The descriptive functions in H have been selected such that the following
three aspects have been taken into particular account:

1. the main morphologies of the QRS complex are mono- and/or biphasic,

2. the broad range of QRS complex durations, and

3. a low complexity implementation.

The wavelet transform is particularly suitable since it is a local transform,
i.e., it provides information about the local behaviour of a signal. One wavelet
decomposition method which may be efficiently implemented is the dyadic wavelet
transform. By careful selection of the filters, a suitable filter bank including
mono- and biphasic impulse responses can be obtained. A symmetric lowpass filter,
$f(n)$, is used repeatedly in order to achieve proper frequency bands. This filter is
combined with one of two filters, $g_b(n)$ or $g_m(n)$, which together define the
waveform morphology (the subindices b and m denote bi- and monophasic,
respectively). For the biphasic case, the recursion is expressed as,

$$\begin{aligned}
h_{1,b}(n) &= g_b(n) \\
h_{2,b}(n) &= f(n) * g_b(2n) \\
h_{3,b}(n) &= f(n) * f(2n) * g_b(4n) \\
&\;\;\vdots \\
h_{q_{\max},b}(n) &= f(n) * \cdots * f(2^{q_{\max}-2} n) * g_b(2^{q_{\max}-1} n)
\end{aligned} \qquad (7)$$

in which the subindex $q_{\max}$ represents the maximum (coarsest) scale.
It is now possible to present an expression for H, which is composed of
one biphasic and one monophasic part,

$$H = [\, \tilde{H}_b \;\; \tilde{H}_m \,] \qquad (8)$$

where the biphasic $H_b$ is defined as

$$H_b = [\, h_{q_{\min},b} \;\cdots\; h_{q_{\max},b} \,] =
\begin{bmatrix}
h_{q_{\min},b}(0) & \cdots & h_{q_{\max},b}(0) \\
\vdots & & \vdots \\
h_{q_{\min},b}(M-1) & \cdots & h_{q_{\max},b}(M-1)
\end{bmatrix} \qquad (9)$$

where the subindex $q_{\min}$ represents the minimum scale and $q_{\min} < q_{\max}$. The
monophasic matrix $H_m$ is computed in a corresponding way. The reversed order of
the columns in $H_b$ and $H_m$, denoted with a tilde in (8), is introduced in order to be
consistent with the model assumed in (4).

In order to mimic the desired mono- and biphasic waveforms with a
low complexity filter bank structure, short filters with small integer coefficients were
used. In (7), the impulse response $f(n)$ was chosen as a third order binomial
function,

$$F(z) = (1 + z^{-1})^3 = 1 + 3z^{-1} + 3z^{-2} + z^{-3} \qquad (10)$$

where $F(z)$ is the Z-transform of $f(n)$. For the biphasic filter bank, the
filter $g_b(n)$ was selected as the first order difference,

$$g_b(n) = [\, -1 \;\; 1 \,] \qquad (11)$$

The filter $g_m(n)$ was chosen such that a compromise between the
requirement of a DC gain equal to zero and an approximately monophasic impulse
response was achieved. A reduction of complexity results when $g_b(n)$ is reused,

$$g_m(n) = g_b(n) * g_b(n) = [\, 1 \;\; -2 \;\; 1 \,] \qquad (12)$$

For this particular choice of $g_m(n)$ and by using Mallat's algorithm, it is
possible to calculate both the biphasic and the monophasic filter output from each
scale by using $f(n)$ once and $g_b(n)$ twice. The filter bank impulse responses are
shown in Fig. 1. The filter bank includes two orthogonal signal sets where the
width of the signal varies within each set. In (a), $h_{j,b}(n)$ is shown for j = 2, 4
from top to bottom, and in (b), the corresponding $h_{j,m}(n)$ are shown.
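
The construction of the impulse responses in (7) from the filters in (10) to (12)
can be sketched as follows. This is an illustration only: it assumes that the
notation f(2n) denotes the filter with zeros inserted between its taps (the usual
reading for a dyadic wavelet filter bank), and the helper names convolve, upsample
and impulse_response are hypothetical.

```python
def convolve(a, b):
    """Full linear convolution of two finite integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def upsample(h, factor):
    """Insert factor-1 zeros between taps (assumed meaning of h(factor * n))."""
    out = []
    for k, tap in enumerate(h):
        out.append(tap)
        if k < len(h) - 1:
            out.extend([0] * (factor - 1))
    return out


F = [1, 3, 3, 1]           # third order binomial lowpass filter, eq. (10)
G_B = [-1, 1]              # first order difference, eq. (11)
G_M = convolve(G_B, G_B)   # reuse of g_b gives [1, -2, 1], eq. (12)


def impulse_response(q, g):
    """h_q(n) = f(n) * f(2n) * ... * f(2^(q-2) n) * g(2^(q-1) n), following (7)."""
    h = [1]
    for s in range(q - 1):
        h = convolve(h, upsample(F, 2 ** s))
    return convolve(h, upsample(g, 2 ** (q - 1)))
```

All coefficients stay small integers, and every response sums to zero because
both $g_b$ and $g_m$ have a DC gain of zero.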
ML parameter estimation 

The unknown coefficient vector $\theta(n)$ can be estimated by using the
maximum likelihood criterion according to,

$$\hat{\theta}(n) = \arg\max_{\theta(n)} p(x(n); \theta(n)) \qquad (13)$$

However, $\hat{\theta}(n)$ is only of interest for that n for which the probability of an
event, or, equivalently, the likelihood ratio test function $T(x(n))$, is
maximized,

$$\hat{n} = \arg\max_{n} T(x(n)) \qquad (14)$$

For this case, $T(x(n))$ can be shown to be [14],

$$T(x(n)) = x(n)^T H (H^T H)^{-1} H^T x(n) \qquad (15)$$

for the case when $\sigma_w^2$ is assumed to be constant.

The optimal estimate $\hat{\theta}$ for the detected event at $\hat{n}$ is thus expressed as

$$\hat{\theta} = \arg\max_{\theta(\hat{n})} p(x(\hat{n}); \theta(\hat{n})) \qquad (16)$$

The MLE of $\theta(n)$ is found by maximizing $p(x(\hat{n}); \theta(\hat{n}))$, or equivalently
minimizing the MSE,

$$(x(n) - H\theta(n))^T (x(n) - H\theta(n)) = x(n)^T x(n) + \theta(n)^T H^T H \theta(n) - 2\theta(n)^T H^T x(n) \qquad (17)$$

Differentiation with respect to $\theta(n)$ yields the optimum $\hat{\theta}$,

$$\hat{\theta}(n) = (H^T H)^{-1} H^T x(n) \qquad (18)$$

By using the above formulation, it is also possible to derive a generalized
likelihood ratio detector based on (15).



Rate 

The central parameter for classification of cardiac arrhythmias is the heart
rate. Most arrhythmias are defined in terms of heart rate, although sometimes with
rather fuzzy limits. Consequently, rate should be considered in order to improve
performance. The RR interval, $\Delta t_k$, is defined as the duration between two
consecutive events,

$$\Delta t_k = (n_k - n_{k-1}) T_s \qquad (19)$$

where $n_k$ and $n_{k-1}$ denote the occurrence times of the events, and $T_s$ denotes the
sampling period.

Leader-follower clustering 

The choice of leader-follower clustering is based on a number of features
which make it suitable for the present invention, viz.
o on-line processing (non-iterative), and

o self-learning, i.e., no a priori knowledge of the clusters is needed.
The starting point is the assumption that an event is present for which it
should be decided whether it belongs to an already existing cluster or whether a new
cluster should be initiated. Since no knowledge is available a priori on which
rhythms or morphologies to expect, the chosen algorithm must be
self-learning. The leader-follower clustering algorithm is constituted by four quantities:

1. The event parameter vector, $\phi(n_k)$, containing the features of the k:th
event that occurs at time $n_k$.

2. The cluster center, $\mu_i$, and the covariance matrix $\Sigma_i$ that together define
the i:th cluster. Since both $\mu_i$ and $\Sigma_i$ are unknown, they are replaced by their
estimates $\hat{\mu}_i$ and $\hat{\Sigma}_i$, respectively.

3. The metric, $d_i^2$, which determines the distance between $\phi(n_k)$ and $\hat{\mu}_i$
according to some suitable function.

4. A rule for adaptation of the cluster parameters for the winning cluster.
Initialization of new clusters

During run-time, a finite number of clusters exists which represent the
rhythms having appeared until present time. Thus, it is occasionally necessary to
initialize a new cluster when the existing ones do not fit the present event
sufficiently well. When the distance function $d_i^2$ exceeds a certain threshold, $\eta$, it
is more likely that the event belongs to a new cluster than to any of the existing
clusters. The selection of $\eta$ is a tradeoff between cluster size and cluster
resolution in the sense that choosing a small $\eta$ will result in many clusters with
few clustering errors. On the other hand, a large $\eta$ results in few clusters but in
more errors.
The minimum distance between $\phi(n)$ and $\hat{\mu}_i$ with respect to both i and
n is

$$d^2_{\min} = \min_{i,l} d_i^2(l), \qquad i = 1, \ldots, I, \qquad l = n_k - \tfrac{K}{2}, \ldots, n_k + \tfrac{K}{2} \qquad (20)$$

over the search interval K. The corresponding minimum distance
indices are found as

$$[\, i_{\min},\; n_{\min} \,] = \arg\min_{i,l} d_i^2(l) \qquad (21)$$

The decision rule based on the comparison of $d^2_{\min}$ and $\eta$ is expressed
as

$$d^2_{\min}
\begin{cases}
> \eta: & \text{Initialize new cluster} \\
< \eta: & \text{Assign to winning cluster} \\
< \rho\eta: & \text{Update winning cluster}
\end{cases} \qquad (22)$$

where $0 < \rho < 1$. When the upper relation in (22) holds, a new cluster is
initialized by first increasing the number of clusters by one, I = I + 1, given that
$I < I_{\max}$ where $I_{\max}$ is the maximum number of clusters, and then initializing the new
cluster as,

$$\hat{\mu}_I = \phi(n_k) \qquad (23)$$

On the other hand, if $I = I_{\max}$ the algorithm needs to discard one of the
existing clusters. This can be done by elimination of, e.g., the oldest or the
"smallest" cluster. The "smallest" cluster is the cluster which is most
rarely fitted with a detected cardiac event.

For the middle and lower conditions in (22), the existing minimum distance
cluster $i_{\min}$ is selected as the winning cluster. However, only the lower condition
results in a cluster parameter update. The reason for including such a distinction is
that only closely similar events should be used for cluster updates in order to
reduce contamination.
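
The three-way rule in (22) and the replacement of a cluster when the maximum
number is reached can be sketched as below. The function names, the string
labels and the dictionary layout of a cluster are illustrative assumptions, and
the behaviour for a distance exactly equal to the threshold, which the rule does
not spell out, is here treated as "assign".

```python
def decide(d2_min, eta, rho):
    """Three-way rule of eq. (22); requires 0 < rho < 1."""
    if d2_min > eta:
        return "new"                 # upper relation: initialize a new cluster
    if d2_min < rho * eta:
        return "assign_and_update"   # lower relation: assign and adapt the winner
    return "assign"                  # middle relation: assign without adaptation


def add_cluster(clusters, phi, i_max):
    """Start a cluster at phi, eq. (23); evict the least-used cluster when full."""
    if len(clusters) >= i_max:
        smallest = min(range(len(clusters)), key=lambda i: clusters[i]["count"])
        del clusters[smallest]
    clusters.append({"center": list(phi), "count": 1})
    return clusters
```

Evicting the least-used cluster corresponds to the "smallest" cluster option;
evicting the oldest one would be an equally valid reading of the text.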



Mahalanobis distance function 

The distance between the feature vector $\phi(n)$ and each cluster is
defined as the Mahalanobis distance, which is a normalized Euclidean distance in
the sense that it projects the parameter vector elements onto univariate
dimensions by including the inverse covariance matrix $\Sigma_i^{-1}$. Thus, a feature with a larger
variance in $\phi(n)$ will be assigned a larger share of the hyperspace before
normalization compared to that with a lower variance. A consequence of normalization is
that the Mahalanobis distance works well on correlated data since $\Sigma_i^{-1}$ then acts as
a decorrelator.

When searching for the minimum distance, a grid search over n is
performed. This grid search is necessary since it not only minimizes the distance but
also results in a more accurate fiducial point estimate than what would be the case
when only considering $T(x(n_k))$. The minimum distance is thus found by a grid
search with respect to all existing clusters, $i = 1, \ldots, I$, and all feature vectors
within the duration of an event, l, as defined in (20),

$$d_i^2(l) = (\phi(l) - \hat{\mu}_i)^T \hat{\Sigma}_i^{-1} (\phi(l) - \hat{\mu}_i) \qquad (24)$$
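
The grid search of (20), (21) and (24) over clusters and sample positions can be
sketched as follows. The data layout is an assumption made for illustration:
features maps each sample index l in the search interval to its feature vector,
and each cluster is a pair of a center and an inverse covariance matrix.

```python
def mahalanobis_sq(phi, mu, sigma_inv):
    """Squared Mahalanobis distance of eq. (24)."""
    e = [p - m for p, m in zip(phi, mu)]
    n = len(e)
    return sum(e[r] * sigma_inv[r][c] * e[c] for r in range(n) for c in range(n))


def grid_search(features, clusters):
    """Return (d2_min, i_min, n_min) over all clusters i and sample indices l."""
    best = None
    for i, (mu, sigma_inv) in enumerate(clusters):
        for l, phi in features.items():
            d2 = mahalanobis_sq(phi, mu, sigma_inv)
            if best is None or d2 < best[0]:
                best = (d2, i, l)
    return best
```

The double loop makes the cost proportional to the number of clusters times the
length of the search interval, which is the IK factor discussed in the complexity
analysis.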


Reference feature adaptation 

In order to track changes in the features of the different rhythms,
adaptation of both $\hat{\mu}_{i_{\min}}(k)$ and $\hat{\Sigma}^{-1}_{i_{\min}}(k)$ is desirable; here the event index k
has been included for clarity. For $\hat{\mu}_{i_{\min}}(k)$, an exponentially updated average is
used:

$$\hat{\mu}_{i_{\min}}(k) = \hat{\mu}_{i_{\min}}(k-1) + \gamma \varepsilon(k) \qquad (25)$$
where

$$\varepsilon(k) = \phi(n_{\min}) - \hat{\mu}_{i_{\min}}(k-1) \qquad (26)$$

The exponential update factor $\gamma$ is confined to the interval $0 < \gamma < 1$.

The inverse covariance matrix, $\hat{\Sigma}^{-1}_{i_{\min}}(k)$, is estimated by exponential
averaging of the new cluster difference matrix $\varepsilon(k)\varepsilon^T(k)$ using the update factor
$(1 - \alpha)$,



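$$\hat{\Sigma}^{-1}_{i_{\min}}(k) = \left[ \alpha \hat{\Sigma}_{i_{\min}}(k-1) + (1 - \alpha)\, \varepsilon(k) \varepsilon^T(k) \right]^{-1} \qquad (27)$$

Using the matrix inversion lemma

$$A = B^{-1} + C D^{-1} C^T \qquad (28)$$

$$A^{-1} = B - BC (D + C^T B C)^{-1} C^T B \qquad (29)$$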



and pairing the terms in (27) with the ones in (28),

$$A = \hat{\Sigma}_{i_{\min}}(k) \qquad (30)$$

$$B^{-1} = \alpha \hat{\Sigma}_{i_{\min}}(k-1) \qquad (31)$$

$$C = \varepsilon(k) \qquad (32)$$

$$D^{-1} = (1 - \alpha) \qquad (33)$$

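the inverse $\hat{\Sigma}^{-1}_{i_{\min}}(k)$ in (27) may be computed from $\hat{\Sigma}^{-1}_{i_{\min}}(k-1)$ without any
matrix inversions as,

$$\hat{\Sigma}^{-1}_{i_{\min}}(k) = \alpha^{-1} \hat{\Sigma}^{-1}_{i_{\min}}(k-1) - \frac{\alpha^{-1} \hat{\Sigma}^{-1}_{i_{\min}}(k-1)\, \varepsilon(k) \varepsilon^T(k)\, \alpha^{-1} \hat{\Sigma}^{-1}_{i_{\min}}(k-1)}{(1 - \alpha)^{-1} + \varepsilon^T(k)\, \alpha^{-1} \hat{\Sigma}^{-1}_{i_{\min}}(k-1)\, \varepsilon(k)} \qquad (34)$$

By utilizing the matrix inversion lemma, the computational complexity
of the operation is reduced from $O(P^3)$ to $O(P^2)$.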
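
The rank-one update in (34), together with the center update in (25) and (26),
can be sketched as below. The sketch assumes a symmetric inverse covariance
matrix and an update factor in the range 0 < alpha < 1 so that both weights in
(27) are positive; both are assumptions of this illustration, as are the
function names.

```python
def update_center(mu, phi, gamma):
    """Eq. (25)-(26): return the new center and the difference vector e(k)."""
    e = [p - m for p, m in zip(phi, mu)]
    return [m + gamma * ei for m, ei in zip(mu, e)], e


def update_inverse_cov(sigma_inv, e, alpha):
    """Eq. (34): inverse of alpha*Sigma + (1-alpha)*e*e^T without matrix inversion."""
    P = len(e)
    # v = alpha^-1 * Sigma^-1(k-1) * e(k); by symmetry v^T is the right-hand factor
    v = [sum(sigma_inv[r][c] * e[c] for c in range(P)) / alpha for r in range(P)]
    # scalar denominator (1-alpha)^-1 + e^T alpha^-1 Sigma^-1(k-1) e
    denom = 1.0 / (1.0 - alpha) + sum(e[r] * v[r] for r in range(P))
    return [[sigma_inv[r][c] / alpha - v[r] * v[c] / denom for c in range(P)]
            for r in range(P)]
```

Only matrix-vector products and one scalar division are needed, which is where
the reduction from cubic to quadratic cost in P comes from.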
Algorithm initialization

In leader-follower clustering, clusters are initialized as they become
needed. This feature is convenient since it does not automatically introduce any
unused clusters. However, it also puts demands on the algorithm to be able to
create new clusters when necessary, and also to terminate clusters either not
used for long or with only a few events. Initially, the total number of clusters, I, is
equal to one. The algorithm is initialized by assigning the parameter vector
$\phi(\hat{n}_1)$, which maximizes the test statistic in (14) of the first event, to the first
cluster, cf. (23),

$$\hat{\mu}_1 = \phi(\hat{n}_1) \qquad (35)$$



For the general case, $\phi(n_k)$ is composed of a subset of the representative
functions together with the preceding RR-interval,

$$\phi(n_k) = \begin{bmatrix} \hat{\theta}_s(n_k) \\ \Delta t_k \end{bmatrix} \qquad (36)$$

where $\hat{\theta}_s(n_k)$ is a subset of the most discriminating elements in $\hat{\theta}(n_k)$.
Note that an event is defined by its depolarization wave. Consequently, $\Delta t_k$
is not included in the arrival time estimation of $\hat{n}_k$, but instead computed
afterwards. The time continuous notation of $\Delta t_k$ is preferred since it results in a
suitable magnitude similar to the normalized morphological information in $\hat{\theta}_s$.

The inverse covariance matrix estimate is initialized in the same way for all
clusters; a simple solution is to set it equal to a scaling of the identity matrix I,

$$\hat{\Sigma}_i^{-1} = \delta I \qquad (37)$$

where $\delta$ is a design parameter. The complete clustering algorithm for
organized events is presented in the flow chart in Fig. 2.
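
As a rough illustration of how the pieces fit together, the following sketch
strings the initialization (35) and (37), the decision rule (22) and the updates
(25) and (34) into one loop. It is deliberately simplified and is not the flow
chart of Fig. 2: each event is represented by a single feature vector (no grid
search), the least-used cluster is evicted when the maximum is reached, and the
class name, method names and cluster layout are hypothetical. The update factor
is assumed to satisfy 0 < alpha < 1.

```python
class LeaderFollower:
    """Leader-follower clustering with one feature vector per event (simplified)."""

    def __init__(self, eta, rho, gamma, alpha, delta, i_max):
        self.eta, self.rho, self.gamma = eta, rho, gamma
        self.alpha, self.delta, self.i_max = alpha, delta, i_max
        self.clusters = []  # each: {"mu": center, "sinv": inverse covariance, "count": hits}

    def _start_cluster(self, phi):
        if len(self.clusters) >= self.i_max:  # full: drop the least-used cluster
            self.clusters.remove(min(self.clusters, key=lambda c: c["count"]))
        P = len(phi)
        sinv = [[self.delta if r == c else 0.0 for c in range(P)] for r in range(P)]
        cluster = {"mu": list(phi), "sinv": sinv, "count": 1}
        self.clusters.append(cluster)
        return cluster

    def _distance(self, phi, cluster):
        e = [p - m for p, m in zip(phi, cluster["mu"])]
        return sum(e[r] * cluster["sinv"][r][c] * e[c]
                   for r in range(len(e)) for c in range(len(e)))

    def _update(self, phi, cluster):
        e = [p - m for p, m in zip(phi, cluster["mu"])]
        cluster["mu"] = [m + self.gamma * ei for m, ei in zip(cluster["mu"], e)]
        P, a, S = len(e), self.alpha, cluster["sinv"]
        v = [sum(S[r][c] * e[c] for c in range(P)) / a for r in range(P)]
        denom = 1.0 / (1.0 - a) + sum(ei * vi for ei, vi in zip(e, v))
        cluster["sinv"] = [[S[r][c] / a - v[r] * v[c] / denom for c in range(P)]
                           for r in range(P)]

    def process(self, phi):
        """Return the cluster that receives the event described by phi."""
        if not self.clusters:
            return self._start_cluster(phi)
        winner = min(self.clusters, key=lambda c: self._distance(phi, c))
        d2 = self._distance(phi, winner)
        if d2 > self.eta:
            return self._start_cluster(phi)
        winner["count"] += 1
        if d2 < self.rho * self.eta:
            self._update(phi, winner)
        return winner
```

Feeding the feature vectors of successive events to process() yields, per event,
the cluster it was assigned to; events far from every existing center open a new
cluster.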
Reduced-complexity clustering

In order to develop a more efficient algorithm in terms of performance
versus power consumption and to evaluate the power consumption itself, a simple
measure of the computational complexity can be used, namely, the total number of
multiplications. A multiplication is a much more complex operation than an
addition. In this algorithm, where most operations are of the nature
"multiply-accumulate", the number of additions is of the same order as the number of
multiplications and may thus be neglected without significant loss of accuracy.

In order to reduce the computational complexity, focus is put on reducing
the number of multiplications. The dominant contributions of multiplication
operations are found in (24) and (34), which require $P(P + 1)$ and $\frac{P}{2}(3P + 5)$
multiplications, respectively, considering certain symmetry properties of $\hat{\Sigma}^{-1}_{i_{\min}}$.
Furthermore, one division is required in (34). However, according to (20), (24) is
performed IK times per event while (34) is performed only once per event.

Based on the above performance figures, a few approximations can
be identified:

o to use only the peak(s) in $T(x(n))$ instead of a complete grid search,
o to use a simplified $\hat{\Sigma}_i^{-1}$, and
o to use a likelihood based search sequence over the clusters, i.e., to
start with the most likely cluster and to stop the search if a sufficiently
small distance is found.
By simplifying the grid search from spanning both samples and clusters
in (20) to spanning only clusters,

$$d^2_{\min} = \min_{i} d_i^2(n_k) \qquad (38)$$

the number of multiplications may be reduced by a factor K to $IP(P + 1)$.
Since, due to sensitivity in $T(x(n))$, the feature vectors resulting in the peaks for two
different events may differ significantly, this simplification is likely to result in more
clusters. A useful compromise may instead be to use the coefficients from, e.g.,
the 3 largest peaks in $T(x(n))$, resulting in $3IP(P + 1)$ multiplications per event.

Since a cardiac event lasts longer than one sample, those
samples which give the filter coefficients most similar to the cluster
reference are determined. A grid search over 40 msec is preferably made. In this
way the coefficients of greatest importance locally are determined. For this decision
$T(x(n))$ is considered and samples generating a peak are chosen, i.e. T(x(n-1)) <
T(x(n)) > T(x(n+1)), since they indicate the probability for the presence of an event.
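
The peak criterion above can be sketched in a few lines. The function name and
the default of three peaks (taken from the "3 largest peaks" compromise) are
choices made for this illustration; T is assumed to be a list of test statistic
values over the search interval.

```python
def local_peaks(T, count=3):
    """Indices n with T[n-1] < T[n] > T[n+1], largest peaks first, at most `count`."""
    peaks = [n for n in range(1, len(T) - 1) if T[n - 1] < T[n] > T[n + 1]]
    return sorted(peaks, key=lambda n: T[n], reverse=True)[:count]
```

Only the feature vectors at the returned indices would then be compared with the
cluster references, instead of every sample in the interval.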

Another simplification, based on the approximation of orthogonality
between the different wavelet scales as well as the RR interval, is to simplify $\hat{\Sigma}^{-1}_{i_{\min}}$
by only including its diagonal elements in the adaptation,

$$\hat{\Sigma}^{-1}_{i_{\min}}(k) = \left[ \alpha \hat{\Sigma}_{i_{\min}}(k-1) + (1 - \alpha)\, \mathrm{diag}\!\left( \varepsilon(k) \varepsilon^T(k) \right) \right]^{-1} \qquad (39)$$

where diag(A) returns the diagonal elements of the square matrix A. By using this
approximation, the number of multiplications used for the estimation of $\hat{\Sigma}^{-1}_{i_{\min}}$ is
reduced to 3P. Additionally, the distance computation in (20) is simplified and may
be reduced to 2IKP multiplications per event.
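
With a diagonal matrix, both the adaptation and the distance reduce to
element-wise operations. The sketch below stores the P diagonal variances and
divides by them in the distance; whether the variances or their reciprocals are
stored is an assumption of this illustration, as are the function names.

```python
def update_diag_variance(var, e, alpha):
    """Element-wise exponential average of the squared differences (diagonal case)."""
    return [alpha * v + (1.0 - alpha) * ej * ej for v, ej in zip(var, e)]


def diag_distance_sq(phi, mu, var):
    """Distance with a diagonal covariance: sum of squared, variance-normalized differences."""
    return sum((p - m) ** 2 / v for p, m, v in zip(phi, mu, var))
```

Each feature element now costs a fixed number of operations, so the work per
distance grows linearly with P instead of quadratically.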

A reasonable assumption is often that successive events originate
from the same rhythm. Considering this knowledge, it would be sufficient to
compute the distance for the previously selected cluster. In doing so, the number
of multiplications in (38) is reduced even further.

Table 1 presents the different detector versions as defined by their
distinguishing features and shows the computational complexity for the different
versions of the clustering algorithm.

Table 1

            Features
Version     $\hat{\Sigma}^{-1}_{i_{\min}}$     Search alg.     Complexity C

A           Full          Interval        IKP(P + 1) + P(3P + 5)/2
B           Diagonal      Interval        2IKP + 3P
C           Full          3 peak          3IP(P + 1) + P(3P + 5)/2
D           Diagonal      3 peak          6IP + 3P



The total computational complexity, C, as reflected by the number of
multiplications for the different algorithm versions, is presented in Fig. 3 as a
function of I. In figure 3 the solid line shows algorithm version A, the dashed line
version B, the dotted line version C, and the dash-dotted line version D, for a
feature vector with (a), P = 4, and (b), P = 7 elements.
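
The complexity expressions of Table 1 can be evaluated directly, which makes the
relative cost of the four versions easy to compare for given I, K and P. The
function name is hypothetical; the formulas are those of the table.

```python
def multiplications(version, I, K, P):
    """Multiplications per event for algorithm versions A to D of Table 1."""
    full_update = P * (3 * P + 5) // 2   # full inverse covariance update, eq. (34)
    diag_update = 3 * P                  # diagonal update, eq. (39)
    table = {
        "A": I * K * P * (P + 1) + full_update,   # full matrix, interval search
        "B": 2 * I * K * P + diag_update,         # diagonal matrix, interval search
        "C": 3 * I * P * (P + 1) + full_update,   # full matrix, 3 peak search
        "D": 6 * I * P + diag_update,             # diagonal matrix, 3 peak search
    }
    return table[version]
```

For instance, with I = 4 clusters, K = 40 and P = 4, version A needs 3234
multiplications per event and version D needs 108.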

Results

The results are obtained by studying the performance of the following
quantities:

o algorithm versions, as presented in Table 1, and
o noise tolerance, for noise-free signals and for signals with
background noise of 50 µV RMS.

The following parameter settings have been used (unless otherwise
stated),

$$\alpha = 1.05 \qquad \gamma = 0.025 \qquad \delta = 50 \qquad K = 40 \qquad (40)$$

It is noted that $\alpha^{-1}$ is chosen to offer faster adaptation than $\gamma$. The reason
for that is that the initial estimate $\hat{\Sigma}_i^{-1}$ is likely to be less accurate than the initial
estimate $\hat{\mu}_i$. Also, $\delta$ is chosen to have the same order of magnitude as the steady
state eigenvalues of $\hat{\Sigma}_i^{-1}$.

It should be noted that the different algorithm versions are not fully
comparable in terms of performance for a specific $\eta$ due to the differences in
distance computation. In versions B and D, where a diagonal $\hat{\Sigma}^{-1}_{i_{\min}}$ is used, the
lack of non-diagonal information results in a nonorthogonal distance which is
larger than the orthogonal one. Since versions C and D make use of a limited
search, the minimum distance found may differ from the global minimum
distance for the event. Both these algorithmic differences imply an increase in
clustering quality for a certain value of $\eta$, however, at the expense of more
introduced clusters.






Evaluation measures 

The two main quantities evaluated are clustering performance and
computational complexity, see Fig. 3; these two quantities are in general
conflicting. In the evaluation, a cluster is assigned to that cardiac rhythm which
contains the most events in the cluster. This rhythm is denoted as the dominant
rhythm within the cluster. With respect to the dominant rhythm, the cluster is
defined as a correct cluster, whereas it is erroneous to all other rhythms. If a
rhythm is found to be dominant in more than one cluster, such clusters are first
merged in the performance evaluation. The number of events in the correct cluster
which belong to the i:th dominant rhythm is equal to $N_D(i)$, while the number of
events belonging to any other false rhythm in the cluster is equal to $N_F(i)$. The
number of events of the dominant rhythm which are not classified in a correct
cluster, i.e., missing, is equal to $N_M(i)$.

The performance of the algorithm is evaluated in terms of the probability
of a correct clustering of an event, P_CC(i), and the probability of a dominant
event in a cluster, P_DE(i). The first parameter is expressed as the share of
correctly clustered events within a rhythm and is, using the above parameters,
defined by

\[ P_{CC}(i) = \frac{N_D(i)}{N_D(i) + N_M(i)} \qquad (41) \]

The second parameter may be expressed as the share of dominant
rhythm events within a cluster, and is defined as

\[ P_{DE}(i) = \frac{N_D(i)}{N_D(i) + N_F(i)} \qquad (42) \]

For the case when a rhythm completely lacks a correct cluster, P_DE(i)
is undefined; the rhythm is then excluded from subsequent statistical
computations. Averaging P_CC(i) and P_DE(i) over all clusters results in the
global performance measures P_CC and P_DE, respectively.
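
The evaluation measures defined above can be illustrated by a minimal Python
sketch. It is not part of the described apparatus; the function names, the
input format (one true rhythm name and one cluster index per event) and the
tie-breaking when two rhythms are equally frequent in a cluster are assumptions
made for illustration only.

```python
from collections import Counter, defaultdict


def evaluate_clustering(true_rhythms, cluster_ids):
    """Return {rhythm: (P_CC, P_DE)}; None marks a rhythm without a correct cluster."""
    # Count the events of each rhythm inside each cluster.
    per_cluster = defaultdict(Counter)
    for rhythm, cid in zip(true_rhythms, cluster_ids):
        per_cluster[cid][rhythm] += 1

    # Assign each cluster to its dominant rhythm and merge clusters that
    # share the same dominant rhythm (ties: first rhythm seen, an assumption).
    merged = defaultdict(Counter)
    for counts in per_cluster.values():
        dominant = counts.most_common(1)[0][0]
        merged[dominant].update(counts)

    measures = {}
    for rhythm, total in Counter(true_rhythms).items():
        if rhythm not in merged:
            measures[rhythm] = None  # no correct cluster: excluded from statistics
            continue
        n_d = merged[rhythm][rhythm]                  # N_D(i)
        n_f = sum(merged[rhythm].values()) - n_d      # N_F(i)
        n_m = total - n_d                             # N_M(i)
        measures[rhythm] = (n_d / (n_d + n_m), n_d / (n_d + n_f))
    return measures


def global_measures(measures):
    """Average P_CC(i) and P_DE(i) over the rhythms that have a correct cluster."""
    defined = [m for m in measures.values() if m is not None]
    p_cc = sum(m[0] for m in defined) / len(defined)
    p_de = sum(m[1] for m in defined) / len(defined)
    return p_cc, p_de
```

Since merged correct clusters correspond one to one to dominant rhythms,
averaging over the rhythms that have a correct cluster is used here in place of
averaging over clusters.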

It should be pointed out that P_CC(i) and P_DE(i) reach their maximal
value of 1 when η is sufficiently small such that the number of clusters equals
the total number of beats evaluated. This is, of course, a highly undesirable
solution although performance, as expressed in (41) and (42), will be excellent.
For this reason, the total number of clusters, I, is a crucial parameter to be
considered. Here, the average Ī is used together with the minimum and maximum
number of clusters for a case, I_min and I_max, respectively.
The power consumption of the algorithm is an important parameter
which determines the pacemaker life span. In this study, power consumption is
approximated by the computational complexity, defined above as the total number
of multiplications. As shown, the computational complexity depends mainly on
three parameters: the feature vector length, P, the search interval, K, and the
cluster size, I.



Noise-free signal clustering performance

Figure 4 presents clustering performance in terms of P_DE and P_CC,
see Fig. 4 (a), and I_min, Ī and I_max, see Fig. 4 (b). Thus in figure 4(a)
clustering performance is shown as depending on Ī for P_CC (dark bars) and P_DE
(bright bars) for noise-free signals. In figure 4(b), the spread between the
different cases is shown as, from left to right, I_min, Ī and I_max. The
presented algorithm versions are found in Table 1 and three values of Ī have
been chosen for comparison: 3, 4 and 5. Versions A and B perform similarly for
all three cases, and achieve P_DE = 1 and P_CC = 1 for Ī = 4 and Ī = 5,
respectively, by creating an acceptable number of clusters. However, it can be
seen from Fig. 4 (b) that a large difference in the number of initialized
clusters between different cases is present. For version B, a slight increase in
both Ī and I_max is observed for Ī = 4. However, this increase is due more to an
unfortunate step-like behaviour in the results for the given Ī than to any
significant decrease in performance. The values of η used in Fig. 4 for the
different versions are shown in Table 2.
Table 2: Values of η used in Fig. 4.

            Algorithm version
  Ī       A       B       C       D
  5      3.6     4.2     5.4     5.8
  4      4.6     4.8     7.0     7.2
  3      9.6     9.6    12.0    10.8



Versions C and D perform slightly worse than do versions A and B.
Contrary to the latter ones, neither version achieves P_DE = 1 or P_CC = 1 for
the presented Ī, see figure 4(a).

Figure 5 (a) presents the clustering performance for version A in detail
and its dependence on η. The most noteworthy result is that both P_DE = 1 and
P_CC = 1 for η < 7. It is also clear that clustering performance deteriorates
rapidly for η > 10. The increase in P_CC for very high values of η is due to the
fact that not all rhythms are allocated to a dominant cluster and are therefore
disregarded in the performance computation.

In Fig. 5 (b), I_min, Ī and I_max are presented as a function of the
clustering threshold. Thus figure 5(a) shows clustering performance in terms of
P_DE (solid line) and P_CC (dashed line) as a function of η for noise-free
signals, and figure 5(b) the corresponding I_min, Ī and I_max. For η < 3, the
number of initiated clusters increases rapidly with decreasing η. Within the
interval 3 < η < 7, all rhythms initiate at least one cluster, while for η > 7,
this is not true for all cases. Removing the "worst case", the above is true for
the other cases for η < 10.

The clustering performance is exemplified for one set of parameters in
Fig. 6 using η = 7. Correct clustering with a minimal number of clusters is
achieved for two cases while an extra cluster is initialized for two cases. For
case 2, the reason for the extra cluster is that the temporal search interval is
chosen too small for the third morphology from the left. For case 5 an actual
difference in morphologies both on the up and down slopes of the dominant peak
is discernible between the first and fourth clusters from the left.

Clustering for a concatenated electrogram is presented for case 3 in
Fig. 7, where three distinct rhythm classes can be discerned, viz. normal sinus
rhythm followed by supraventricular tachycardia and atrial flutter. The EGM is
shown together with the clusters, represented by o, x and +, respectively, assigned
for each event. The different rhythms result in three clusters.

Figure 8 is a flow chart illustrating the function in broad outline of an
embodiment of the apparatus according to the invention. At block 2 cardiac event
features are extracted in the form of wavelet coefficients, and the event is
detected, at 4, 6. At block 8 it is checked whether the detected event is a
member of a labelled cluster. If so, the event is added to a class, at 10, and
actions associated with that class are performed, at 12.

If the event is not a member of a labelled class it is checked if it is a
member of an existing cluster, at 14. If so, the event is added to a class, at 16, and
it is checked if it is possible to label the cluster, 18, and if so the cluster is labelled,
at 20.

If the event is not a member of an existing cluster, block 14, a new
cluster fitted to the detected cardiac event is created as described above, at 22.
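
The decision flow of blocks 8 to 22 can be sketched in Python as follows. This
is only an illustrative outline under stated assumptions: the names `Cluster`,
`nearest_within` and `process_event`, the membership test (distance to the
cluster centre not exceeding a threshold `eta`), and the caller-supplied
`distance`, `try_label` and `perform_actions` functions are all hypothetical and
do not reproduce the clustering and labelling rules of the description. The
cluster centre is not updated in this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cluster:
    centre: List[float]
    events: List[List[float]] = field(default_factory=list)
    label: Optional[str] = None  # None until the cluster has been labelled


def nearest_within(features, candidates, eta, distance):
    """Return the closest candidate cluster whose distance is at most eta, else None."""
    best, best_d = None, eta
    for cluster in candidates:
        d = distance(features, cluster.centre)
        if d <= best_d:
            best, best_d = cluster, d
    return best


def process_event(features, clusters, eta, distance, try_label, perform_actions):
    """Route one detected event through the flow of blocks 8 to 22."""
    # Block 8: is the event a member of a labelled cluster?
    labelled = [c for c in clusters if c.label is not None]
    hit = nearest_within(features, labelled, eta, distance)
    if hit is not None:
        hit.events.append(list(features))   # block 10
        perform_actions(hit.label)          # block 12
        return "classified"

    # Block 14: is the event a member of an existing, unlabelled cluster?
    unlabelled = [c for c in clusters if c.label is None]
    hit = nearest_within(features, unlabelled, eta, distance)
    if hit is not None:
        hit.events.append(list(features))   # block 16
        label = try_label(hit)              # block 18
        if label is not None:
            hit.label = label               # block 20
            return "labelled"
        return "clustered"

    # Block 22: create a new cluster fitted to the event.
    clusters.append(Cluster(centre=list(features), events=[list(features)]))
    return "new_cluster"
```

Keeping the labelling rule (`try_label`) separate from the membership test
mirrors the point that the rules needed to label a cluster are not used in
identifying the event itself.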

Thus according to the invention clustering of events in the EGM is
performed for use in implantable CRM devices, like heart stimulators. The
invention is based on feature extraction in the wavelet domain whereupon the
features are clustered based on the Mahalanobis distance criterion. According to
advantageous embodiments of the apparatus according to the invention
simplifications of the technique are proposed in order to reduce computational
complexity and to obtain a solution that is more feasible to implement.
If the apparatus according to the invention is to be used for longer
periods of data analysis, large clusters, although old, may be desirable to be kept
in some way, while the oldest cluster may be selected to be removed if the
application is based on shorter time frames. Also, due to the short data lengths,
testing of such algorithms would be of limited value.
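
The two removal strategies mentioned above can be contrasted in a short Python
sketch. The description does not specify a removal rule, so everything here is
an assumption for illustration: the function name, the per-cluster fields
`size` and `last_update`, and the choice of "remove the smallest cluster" as one
possible way of keeping large but old clusters.

```python
def select_cluster_to_remove(clusters, long_term):
    """Return the index of the cluster to discard when storage is full.

    clusters  -- list of dicts with keys 'size' (number of events) and
                 'last_update' (time index of the most recent event)
    long_term -- True for long analysis periods, False for short time frames
    """
    if long_term:
        # Hypothetical policy: protect large clusters even if they are old by
        # removing the smallest one; among equally small clusters, the oldest.
        return min(range(len(clusters)),
                   key=lambda k: (clusters[k]["size"], clusters[k]["last_update"]))
    # Short time frames: simply remove the oldest cluster.
    return min(range(len(clusters)), key=lambda k: clusters[k]["last_update"])
```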
By combining the detector/clusterer with the labelling rules of a classifier, a
complete detector/classifier is obtained with the possibility of more accurate
therapy.

The labelling need not be done in real time and probably more than
one event will be needed to label a cluster. Once the cluster has been labelled
using clinical terms, the actions associated with the particular class will be carried
out immediately, i.e. in real time. Thus, the rules needed to label the cluster are
not used in identifying the event itself.

The rules used to label the clusters are based on characteristics of the
different possible events. Instead of the exact rules, the characteristics are
consequently described.

Figure 9 is a block diagram of an embodiment of an implantable
heart stimulator provided with the apparatus according to the invention. Electrodes
30, 32 implanted in the heart 34 of a patient are connected by a lead 36 to an A/D
converter 40, via a switch 38 serving as overvoltage protection for the A/D
converter 40. In the A/D converter 40 the signal is A/D converted and the digital
signal is supplied to a wavelet detector 42.

The detector 42 decides whether a cardiac event is present or not as
described earlier. Wavelet coefficients are calculated as well. Parameters of the
detector 42 are programmable from the stimulator microprocessor 44. At the
detection the coefficients and the RR information are forwarded to the clusterer 46
in which it is determined to which cluster the detected cardiac event belongs, as
described previously. The clusterer 46 is preferably of a leader-follower type and
also the cluster parameters are programmable from the microprocessor 44.

Suitable therapy is decided by the microprocessor 44 depending on the
assigned cluster for the detected event and possible a priori knowledge about
the arrhythmia associated with the cluster in question. Thus it is possible to
distinguish e.g. ventricular tachycardia from a sinus tachycardia by comparing
the parameters with a known normal sinus rhythm. Parameters of the sinus
tachycardia are then supposed to be similar to those of a normal sinus rhythm,
whereas parameters of ventricular tachycardia differ significantly.
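
The leader-follower principle preferred for the clusterer 46 can be illustrated
by a minimal Python sketch. It is a generic leader-follower scheme under
simplifying assumptions and not the exact algorithm of the description: the
class and function names are hypothetical, a fixed diagonal variance `init_var`
is used for every cluster, the threshold `eta` is applied to the squared
distance, and the cluster mean is updated as a plain running average.

```python
def diag_mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance for a diagonal covariance matrix."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))


class LeaderFollower:
    def __init__(self, eta, init_var=1.0):
        self.eta = eta            # clustering threshold (assumed on squared distance)
        self.init_var = init_var  # assumed fixed variance per feature
        self.means = []
        self.variances = []
        self.counts = []

    def assign(self, x):
        """Assign feature vector x to a cluster and return the cluster index."""
        best, best_d = None, float("inf")
        for k, (mean, var) in enumerate(zip(self.means, self.variances)):
            d = diag_mahalanobis_sq(x, mean, var)
            if d < best_d:
                best, best_d = k, d

        if best is None or best_d > self.eta:
            # Leader step: no cluster is close enough, so x starts a new one.
            self.means.append(list(x))
            self.variances.append([self.init_var] * len(x))
            self.counts.append(1)
            return len(self.means) - 1

        # Follower step: move the winning cluster mean towards x.
        self.counts[best] += 1
        n = self.counts[best]
        self.means[best] = [m + (xi - m) / n for m, xi in zip(self.means[best], x)]
        return best
```

A smaller `eta` makes the leader step fire more often, which gives more
clusters, in line with the trade-off between the threshold and the number of
clusters discussed in the evaluation.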

As an alternative the decision rules can be trained from a number of
rhythms and the resulting rules are then used on test data, see Weichao Xu et al.,
"New Bayesian Discriminator for Detection of Atrial Tachyarrhythmias",
DOI:10.1161/01.CIR.0000012349.14270.54, pp. 1472-1479, January 2002. It is
then possible to decide that a certain position of the cluster indicates e.g. a sinus
rhythm, etc. Also this technique can be based on analysis of the feature vector for
a cluster, and it is possible to decide if the beat is broad or narrow, large or small,
or if the rhythm is regular or irregular.

The stimulator in figure 9 also includes a pulse unit 48 with associated 
battery 50 for delivery of stimulation pulses to the patient's heart 34 depending on 
the clustering evaluation. 

The implantable stimulator shown in figure 9 includes telemetry means 52
with antenna 54 for communication with external equipment, like a programmer.